1.  Introduction  and  Summary.  A  discrete  time  stationary  stochastic  process  is  said  to  be 
long  range  dependent  if  its  covariances  decrease  to  zero  like  a  power  of  lag  as  the  lag  tends 
to  infinity  but  their  absolute  sum  diverges.  Such  processes  arise  in  applications  in 
Hydrology,  Economics,  Time  Series  Analysis  and  other  sciences.  See,  e.g.,  the  review  paper 
by  Mandelbrot  and  Taqqu  (1979)  and  references  therein  for  the  importance  of  these 
processes.  See  Granger  and  Joyeux  (1980),  and  Hosking  (1981)  for  the  usefulness  of  these 
processes  in  Economics  and  Time  Series  Analysis.  For  many  technical  results  on  these 
processes,  see  Taqqu  (1975,  1979),  Fox  and  Taqqu  (1987)  and  Dehling  and  Taqqu  (1989), 
and  Yajima  (1985,  1988),  among  others. 

One of the popular classes of estimators in linear models that has evolved over the last two and a half decades is the so-called class of M-estimators. Most of the asymptotic literature on these estimators assumes either independent errors (Huber (1981) and references therein) or weakly dependent errors, such as strongly mixing errors, as in Koul (1977).

Because of the importance of both M-estimators and long range dependence, it
is  of  interest  to  study  the  large  sample  behavior  of  these  estimators  in  a  linear  regression 
setting  when  errors  are  either  long  range  dependent  Gaussian  or  functions  of  such  random 
variables  (r.v.'s).  About  the  design  variables  in  the  linear  model  we  shall  assume  that  they 
are  either  r.v.'s  or  known  constants.  In  the  former  case  it  will  be  further  assumed  that  the 
design  variables  are  independent  of  the  errors  and  either  i.i.d.  or  long  range  dependent. 

The  case  of  the  known  constant  designs  will  be  discussed  in  Section  3.  We  shall  for  the 
time  being  restrict  our  attention  to  the  case  of  random  designs. 

Accordingly, let $\eta_1, \eta_2, \ldots$ be a sequence of strictly stationary mean zero unit variance Gaussian r.v.'s with $\rho(k) := E\eta_1\eta_{1+k}$, $k \ge 0$. Let $\xi_1, \xi_2, \ldots$ be a sequence of observable $p \times 1$ stationary mean zero random vectors with $\Gamma(k) := E\xi_1\xi_{1+k}'$, $k \ge 0$.

Consider the linear model

(1)  $Y_i = X_i'\beta + \epsilon_i$,  $X_i' = (1, \xi_i')$,  $i \ge 1$,

where $\epsilon_i = G(\eta_i)$, $i \ge 1$, $G$ a measurable function from $\mathbb{R}$ to $\mathbb{R}$.

Note that the marginal distribution of $\epsilon_i$ need not be Gaussian. In fact if one were to have a linear regression model with stationary errors whose marginal distribution function (d.f.) is $F$, then choosing $G = F^{-1}(\Phi)$ would yield the desired errors. Here $\Phi$ is the d.f. of a $N(0,1)$ r.v. and $F^{-1}(u) = \inf\{x;\ F(x) \ge u\}$, $0 \le u \le 1$.
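
The construction $G = F^{-1}(\Phi)$ can be illustrated numerically. The following is a minimal Python sketch and not part of the original development; the standard Laplace d.f. as the target $F$ and the names `laplace_quantile` and `make_G` are assumptions made only for illustration.

```python
import math
from statistics import NormalDist

def laplace_quantile(u: float) -> float:
    """Quantile function F^{-1}(u) of the standard Laplace d.f. (illustrative choice of F)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    # F(x) = 0.5*exp(x) for x < 0 and 1 - 0.5*exp(-x) for x >= 0.
    return math.log(2.0 * u) if u < 0.5 else -math.log(2.0 * (1.0 - u))

def make_G(quantile):
    """Return G = F^{-1}(Phi), mapping a N(0,1) value to the target marginal."""
    phi = NormalDist()  # standard normal d.f. Phi
    return lambda eta: quantile(phi.cdf(eta))

G = make_G(laplace_quantile)
errors = [G(eta) for eta in (-1.5, -0.5, 0.0, 0.5, 1.5)]
print(errors)
```

Because the Laplace d.f. is symmetric about zero, the resulting $G$ is skew symmetric, which is the situation considered later for skew symmetric scores.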

The class of M-estimators, one corresponding to each $\psi$, is defined as a solution $\hat\beta_N$ of the equation

(2)  $S(t) := \sum_{i=1}^{N} X_i\,\psi(Y_i - X_i't) = 0$,

where $\psi$ is a measurable function from $\mathbb{R}$ to $\mathbb{R}$ with

(3)  $E\psi(\epsilon) = 0$,  $0 < E\psi^2(\epsilon) < \infty$.

Here, and in the sequel, $\eta$, $\epsilon$, $\xi$ etc. are copies of $\eta_1$, $\epsilon_1$, $\xi_1$ etc. Also for a $p \times 1$ vector $t \in \mathbb{R}^p$, $t'$ will denote its transpose and $\|t\|$ will stand for its Euclidean norm.
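
To make the defining equation (2) concrete, here is a minimal Python sketch for the simplest special case of a scalar design without intercept. It is an illustration under stated assumptions, not the procedure of this paper: the Huber score with tuning constant 1.345, the bisection solver, the bracketing interval and the toy data are all hypothetical choices. Bisection is applicable because $S(t)$ is nonincreasing in $t$ whenever $\psi$ is nondecreasing.

```python
def huber_psi(x: float, c: float = 1.345) -> float:
    """Huber score: identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, x))

def score(t, xs, ys, psi=huber_psi):
    """S(t) = sum_i x_i * psi(y_i - x_i * t) for a scalar design without intercept."""
    return sum(x * psi(y - x * t) for x, y in zip(xs, ys))

def m_estimate(xs, ys, lo=-1e3, hi=1e3, tol=1e-10, psi=huber_psi):
    """Solve S(t) = 0 by bisection; S is nonincreasing in t when psi is nondecreasing."""
    if score(lo, xs, ys, psi) < 0 or score(hi, xs, ys, psi) > 0:
        raise ValueError("root not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, xs, ys, psi) > 0:
            lo = mid   # S(mid) > 0: the root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [0.5, -1.0, 2.0, 1.5, -0.7]
ys = [2.0 * x for x in xs]   # noise-free responses with slope 2
ys[2] += 50.0                # one gross outlier
beta_hat = m_estimate(xs, ys)
print(beta_hat)
```

With the outlier removed the solver returns the true slope; with it, the clipped score limits the influence of the contaminated observation.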

The present paper is concerned with investigating the large sample behavior of M-estimators when the r.v.'s $\{\eta_i\}$, in addition, satisfy

(4)  $\rho(k) = k^{-D_1} L_1(k)$,  $0 < D_1 < 1$,  $k \ge 1$,

where $L_1(k)$ is positive for large $k$ and slowly varying at infinity, i.e., $L_1(tx)/L_1(t) \to 1$ as $t \to \infty$ for every $x > 0$.

About $\{\xi_i\}$ we shall additionally assume that

(5)  $\{\xi_i\}$ are independent of $\{\epsilon_i\}$

and either

(6a)  $\xi_1, \xi_2, \ldots$ are i.i.d. r.v.'s

or

(6b)  $\{\xi_i\}$ are dependent with $\Gamma(k) = k^{-D_2}\mathcal{L}(k)$,  $0 < D_2 < 1$,

where $\mathcal{L}$ is a $p \times p$ matrix of slowly varying functions at infinity and $\mathcal{L}(k)$ are positive definite for all large $k$.

The  processes  that  have  covariances  like  (4)  or  (6b)  are  called  long  range  dependent. 
These  covariances  tend  to  zero  but  not  fast  enough  so  as  to  be  summable. 
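
The non-summability can be seen numerically. The sketch below is illustrative only: it takes the slowly varying factor to be identically 1 (an assumption) and compares partial sums of $k^{-D}$ for one exponent below 1 and one above 1; the function name `partial_sum` is hypothetical.

```python
def partial_sum(D: float, N: int) -> float:
    """Sum of k^{-D} for k = 1, ..., N (covariances of the form (4) with L = 1)."""
    return sum(k ** (-D) for k in range(1, N + 1))

for D in (0.4, 1.5):
    print(D, [round(partial_sum(D, N), 3) for N in (10**3, 10**4, 10**5)])
```

For $D = 0.4$ the partial sums keep growing roughly like $N^{1-D}$, whereas for $D = 1.5$ they stabilize, which is the summable (weakly dependent) situation.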

In the case when errors are independent or weakly dependent, $A_N(\hat\beta_N - \beta)$ turns out to be asymptotically normally distributed where $A_N$ equals $N^{1/2}$ in the case $\{\xi_i\}$ are i.i.d. r.v.'s or equals $(X'X)^{1/2}$ in the case $\{\xi_i\}$ are the known constants. Here $X'X = \sum_{i=1}^{N} X_iX_i'$.

Recall that the way this result is proved is first to approximate $\hat\beta_N - \beta$ by $\{\sum_{i=1}^{N} X_iX_i'\psi'(\epsilon_i)\}^{-1} S(\beta)$. Then, by the LLN's, the first term in this approximation is seen to be of the order $N^{-1}$ and this $N^{-1}$ is split so as to stabilize $S(\beta)$ and $\hat\beta_N - \beta$. In the case the errors are independent or weakly dependent and the design variables are random, the scores $S(\beta)$ are of the order $O_p(N^{1/2})$ and hence one must have $A_N = N^{1/2}$. Note that, in view of the Ergodic Theorem, the first term in the above approximation is $O_p(N^{-1})$ as long as the summands $\{X_iX_i'\psi'(\epsilon_i)\}$ are stationary, ergodic, have finite first moments and $\{E[X_1X_1'\psi'(\epsilon_1)]\}^{-1}$ exists, regardless of whether the r.v.'s are long range dependent or not. Hence, even in the present case, the magnitude of $S(\beta)$ determines that of $\hat\beta_N - \beta$. The exposition in Section 2 below uses this observation. A similar observation is used in Section 3 when the design variables are the known constants.

One  of  the  observations  of  this  note  is  that  the  class  of  M  -  estimators  corresponding 
to  the  skew  symmetric  scores  and  symmetric  errors  (i.e.  skew  symmetric  G)  asymptotically 
behaves  like  the  least  squares  estimator  under  (6a)  or  (6b)  or  the  known  constant  design 
case.  This  result,  in  the  cases  of  (6a)  and  (6b),  is  stated  and  proved  in  Section  2  and  in  the 
other case, in Section 3, below. A similar observation was made by Beran and Künsch (1985) in connection with the one sample location model. We further observe that in these cases if the design variables are either i.i.d. or known constants then the limiting distributions are Normal. But if the design variables are also strongly dependent and there is no intercept parameter in the model then the limiting distributions are nonnormal and appear at the end of Section 2.

In what follows, $L$, with or without suffix, is a generic notation for a slowly varying function. All limits are taken as $N \to \infty$, unless mentioned otherwise. Also in most of our discussion the design variables need not be Gaussian.

2. The Case of Random Designs. A preliminary result needed for obtaining a first order approximation to M-estimators is the asymptotic uniform linearity of $S$. The following theorem gives a set of sufficient conditions for such a result to hold. It also gives the required approximation to M-estimators. The statement of the Theorem is somewhat self-contained.

Theorem 1. Let $(\xi_i', \epsilon_i)$, $i \ge 1$, be a strictly stationary sequence of random vectors with $\xi_i$ being $p \times 1$. Let $X_i' = (1, \xi_i')$,

$Y_i = X_i'\beta + \epsilon_i$, for some $\beta \in \mathbb{R}^{p+1}$, $i \ge 1$.

In addition assume the following.

(a) The score function $\psi$ satisfies (1.3) and is absolutely continuous with a.e. derivative $\psi'$ satisfying $E|\psi'(\epsilon)| < \infty$, and

$E\|X_1\|^2\,|\psi'(\epsilon - z\|X_1\|) - \psi'(\epsilon)| \to 0$ as $z \to 0$.

(b) For $N \ge p+1$, there are sequences $\{A_N\}$ and $\{B_N\}$ of $(p+1) \times (p+1)$ matrices which are positive definite for sufficiently large $N$ and satisfy

(i)  $\|B_N^{-1}\| \to 0$,  $\|A_N^{-1}\| \to 0$,  $N \cdot \|A_N^{-1}\| \cdot \|B_N^{-1}\| \to 1$.

(ii)  $\|B_N^{-1} S(\beta)\| = O_p(1)$.

Then, for every $0 < b < \infty$,

(1)  $E \sup_{\|\Delta\| \le b} \|B_N^{-1}[S(\beta + A_N^{-1}\Delta) - S(\beta)] + B_N^{-1}\sum_i X_iX_i'\psi'(\epsilon_i)A_N^{-1}\Delta\| = o(1)$,

where $S$ is as in (1.2).

In addition, if $\{\xi_i\}$ are independent of $\{\epsilon_i\}$ and if $E\|\xi\|^2 < \infty$, then the random coefficient of $\Delta$ in the linear expansion (1) may be replaced by $R \cdot E\psi'(\epsilon)$ where $R := EX_1X_1'$.

Furthermore, if

(c) $R^{-1}$ exists, and (d) $0 < E\psi'(\epsilon)$,

then

(2)  $A_N(\hat\beta_N - \beta) = [R\,E\psi'(\epsilon)]^{-1} \cdot B_N^{-1}S(\beta) + o_p(1)$.

Remark 1. It is perhaps worth repeating that in the above theorem neither $\{\xi_i\}$ nor $\{\epsilon_i\}$ need be Gaussian or functions of Gaussian r.v.'s.

Proof. From the definition of $S$, $S(\beta + \Delta) = \sum_i X_i\,\psi(\epsilon_i - X_i'\Delta)$. Now, use the definition of absolute continuity and routine arguments to get that the

L.H.S.(1) $\le b\,E\sum_i \|B_N^{-1}X_i\|\,\|X_i\|\int_{-\|A_N^{-1}\|}^{\|A_N^{-1}\|} |\psi'(\epsilon_i - z\|X_i\|b) - \psi'(\epsilon_i)|\,dz$

$\le b\,N\,\|B_N^{-1}\|\,\|A_N^{-1}\| \cdot \|A_N^{-1}\|^{-1}\int_{-\|A_N^{-1}\|}^{\|A_N^{-1}\|} E[\|X_1\|^2\,|\psi'(\epsilon - z\|X_1\|b) - \psi'(\epsilon)|]\,dz \to 0$,

by (a) and (b)(i).

The claim about replacing $N^{-1}\sum_i X_iX_i'\psi'(\epsilon_i)$ by $R \cdot E\psi'(\epsilon)$ follows from the Ergodic Theorem. The claim (2) is obtained from (1), (b)(ii), (c) and (d), with the help of the Schauder fixed point Theorem, just as in Huber (1981). □

Remark 2. Observe that $\psi(x) \equiv x$ a priori satisfies (a). Another example of $\psi$ satisfying (a) is the Huber function $\psi(x) := x\,I(|x| \le c) + c\,\mathrm{sgn}(x)\,I(|x| > c)$, $c > 0$, provided $\{\xi_i\}$ are independent of $\{\epsilon_i\}$, $E\|\xi\|^2 < \infty$, and $F$ is continuous at $\pm c$. To see this observe that for this $\psi$ the

L.H.S.(a) $\le E\|X_1\|^2\{[F(c + z\|X_1\|) - F(c - z\|X_1\|)] + [F(-c + z\|X_1\|) - F(-c - z\|X_1\|)]\}$.

Now the Dominated Convergence Theorem gives the claim. □
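
The bound in Remark 2 can be evaluated directly in a simplified setting. The sketch below is only an illustration: it fixes $\|X_1\| = 1$, takes $F$ to be the standard normal d.f. and uses $c = 1.345$, all of which are assumptions and not requirements of the remark; the function name `remark2_bound` is hypothetical.

```python
from statistics import NormalDist

def remark2_bound(z: float, c: float = 1.345, F=NormalDist().cdf) -> float:
    """Upper bound [F(c+z) - F(c-z)] + [F(-c+z) - F(-c-z)] on E|psi'(e - z) - psi'(e)|
    for the Huber score, with ||X_1|| fixed at 1 (illustrative simplification)."""
    z = abs(z)
    return (F(c + z) - F(c - z)) + (F(-c + z) - F(-c - z))

for z in (1.0, 0.1, 0.01, 0.001):
    print(z, remark2_bound(z))
```

The printed values shrink with $z$, reflecting the continuity of $F$ at $\pm c$ that the remark requires.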

Observe that so far we have not used (1.4) or (1.6a) or (1.6b) or even the assumption about $\{\eta_i\}$ being Gaussian. We shall now use these assumptions to determine the sequences of matrices $\{A_N\}$ and $\{B_N\}$. The main requirement on $B_N$ is (b)(ii). Once $B_N$ is determined, $A_N$ can be determined from (b)(i).

In order to assess the magnitude of $S$ (write $S$ for $S(\beta)$) we shall use the Hermite expansion of $L^2(\mathbb{R}; d\Phi)$ functions. What follows about Hermite expansions etc. is borrowed from Feller (1971) and Taqqu (1975). With $\{H_q, q \ge 0\}$ denoting the Hermite polynomials, let $J_q := E\psi_1(\eta)H_q(\eta)$, where $\psi_1 = \psi(G)$. Let $m := \min\{q \ge 1;\ J_q \ne 0\}$ denote the Hermite rank of $\psi_1(\eta)$. The Hermite expansion of rank $m$ of $\psi_1(\eta_i)$ is given by

$\sum_{q=m}^{\infty} \frac{J_q}{q!} H_q(\eta_i)$.

Recall from Feller (1971) that $\{H_q(\eta_i)\}$ is a set of orthogonal r.v.'s in $L^2(\mathbb{R}; d\Phi)$ satisfying

(3)  $H_0(x) \equiv 1$,  $EH_q(\eta) = 0$, $q \ge 1$;  $EH_q(\eta_i)H_n(\eta_j) = 0$ if $q \ne n$, and $= q!\,\rho^q(i-j)$ if $q = n$,  $\forall\, i, j$.
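
The orthogonality relations in (3) can be checked numerically in the special case $i = j$, where $\rho(0) = 1$. The following Python sketch is illustrative and assumes the probabilists' normalization of the Hermite polynomials (leading coefficient 1); the helper names `hermite` and `gauss_expect` and the quadrature settings are hypothetical choices.

```python
import math

def hermite(q: int, x: float) -> float:
    """Probabilists' Hermite polynomial H_q(x) via H_{q+1}(x) = x*H_q(x) - q*H_{q-1}(x)."""
    h_prev, h = 1.0, x          # H_0 and H_1
    if q == 0:
        return h_prev
    for k in range(1, q):
        h_prev, h = h, x * h - k * h_prev
    return h

def gauss_expect(f, lim: float = 10.0, n: int = 20001) -> float:
    """E f(eta) for eta ~ N(0,1), by the trapezoidal rule on [-lim, lim]."""
    step = 2.0 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        x = -lim + i * step
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * f(x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)

# E H_q(eta) H_n(eta) should be q! when q == n and 0 otherwise (the case i == j of (3)).
for q in range(4):
    print([round(gauss_expect(lambda x: hermite(q, x) * hermite(n, x)), 6) for n in range(4)])
```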

Now, we begin the argument for determining $B_N$ and $A_N$. For a $\lambda \in \mathbb{R}^{p+1}$, write $\lambda' = (\lambda_1, \lambda_2')$, $\lambda_1 \in \mathbb{R}$, $\lambda_2 \in \mathbb{R}^p$. From (1.5) and (3), $\forall\, \lambda \in \mathbb{R}^{p+1}$,

(4)  $E[\lambda'\sum_{i=1}^{N} X_iH_m(\eta_i)]^2 = m!\sum_{i=1}^{N}\sum_{j=1}^{N} (\lambda_1^2 + \lambda_2'\Gamma(i-j)\lambda_2)\,\rho^m(i-j)$.

At this point we need to consider (1.6a) and (1.6b) separately. Suppose that (1.6a) holds. Then the

RHS(4) $= m!\,[\lambda_1^2\sum_i\sum_j \rho^m(i-j) + \lambda_2'\Gamma(0)\lambda_2\,N]$.

Now, if we restrict $D_1 < 1/m$, then from Taqqu (1975; Lemma 3) it follows that the

RHS(4) $\sim c_1 N^{2-mD_1}\lambda_1^2 L(N) + m!\,\lambda_2'\Gamma(0)\lambda_2\,N$,

where $c_1$ is a constant depending on $D_1$ and $m$. Thus in this case if we choose

(5)  $B_N = \begin{pmatrix} N^{H_1}L(N) & 0_{1\times p} \\ 0_{p\times 1} & N^{1/2}I_{p\times p} \end{pmatrix} = \begin{pmatrix} B_{N1} & 0 \\ 0 & B_{N2} \end{pmatrix}$, say,

with $2H_1 = 2 - mD_1$, then we see that

(6)  $E(\lambda'B_N^{-1}\sum_i X_iH_m(\eta_i))^2 = O(1)$  $\forall\, \lambda \in \mathbb{R}^{p+1}$.

We note that $D_1 < 1/m$ implies that $\{\psi_1(\eta_i)\}$ are also long range dependent. The case $D_1 > 1/m$ would yield that these r.v.'s are asymptotically weakly dependent and not interesting to us from the current point of view.
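
The growth rate $N^{2-mD_1}$ behind the choice $2H_1 = 2 - mD_1$ can be checked numerically. The sketch below is illustrative only: it takes the slowly varying factor to be identically 1, so that $\rho^m(k) = k^{-mD_1}$ for $k \ge 1$, ignores the diagonal terms, and compares the double sum with the elementary leading term $2N^{2-a}/((1-a)(2-a))$, $a = mD_1$. The function names and parameter values are assumptions.

```python
def double_sum(N: int, a: float) -> float:
    """Sum over i != j in 1..N of |i-j|^{-a}, computed as 2 * sum_k (N-k) * k^{-a}."""
    return 2.0 * sum((N - k) * k ** (-a) for k in range(1, N))

def leading_term(N: int, a: float) -> float:
    """Leading-order approximation 2 * N^{2-a} / ((1-a)(2-a)), valid for 0 < a < 1."""
    return 2.0 * N ** (2.0 - a) / ((1.0 - a) * (2.0 - a))

m, D1 = 1, 0.4            # illustrative values with m * D1 < 1
for N in (10**2, 10**3, 10**4):
    print(N, double_sum(N, m * D1) / leading_term(N, m * D1))
```

The ratios approach 1 as $N$ grows, so dividing by $N^{H_1}$ with $2H_1 = 2 - mD_1$ stabilizes the variance of the first component in this simplified setting.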

Now suppose that (1.6b) holds. Then the

RHS(4) $\approx m!\,[\lambda_1^2\sum_i\sum_j |i-j|^{-mD_1}L(i-j) + \sum_i\sum_j \lambda_2'\mathcal{L}(i-j)\lambda_2\,|i-j|^{-D_2-mD_1}L(i-j)]$.

Note that $\mathcal{L}$ being a matrix of slowly varying functions at infinity and $\mathcal{L}(k)$ being positive definite for all large $k$, it follows that for every $\lambda_2 \in \mathbb{R}^p$, $\lambda_2'\mathcal{L}\lambda_2$ is slowly varying at infinity and that $\lambda_2'\mathcal{L}(k)\lambda_2 > 0$ for all large $k$ and for every $\lambda_2 \in \mathbb{R}^p$. Once again, use the arguments as in Taqqu (op cit.) to conclude that the

R.H.S.(4) $\sim c_1\lambda_1^2 N^{2-mD_1}L(N) + c_2\,\lambda_2'\mathcal{L}(N)\lambda_2\,N^{2-mD_1-D_2}L(N)$,

provided we assume

(7)  $0 < D_1 < 1/m$,  $mD_1 + D_2 < 1$,  $0 < D_2 < 1$.

Here $c_1$ and $c_2$ are constants depending on $m$, $D_1$ and $D_2$. Thus a choice of $B_N$ here is

(8)  $B_N = \begin{pmatrix} N^{H_1} & 0_{1\times p} \\ 0_{p\times 1} & N^{H_2}I_{p\times p} \end{pmatrix} L(N) = \begin{pmatrix} B_{N1} & 0 \\ 0 & B_{N2} \end{pmatrix}$, say,

with $H_1$ as in (5) and $2H_2 = 2 - mD_1 - D_2$.

With this $B_N$, one can again verify that (6) above holds in this case. Note that (7) implies

(9)  $1/2 < H_1 < 1$,  $1/2 < H_2 < 1$.
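
The bookkeeping of exponents in (5), (7), (8) and (9) is easy to get wrong by hand, so a small helper may be useful. The following Python sketch is illustrative; the function name `scaling_exponents` and its interface are hypothetical, and slowly varying factors are ignored.

```python
from typing import Optional, Tuple

def scaling_exponents(m: int, D1: float, D2: Optional[float] = None) -> Tuple[float, float]:
    """Return the exponents of N in B_N1 and B_N2 (slowly varying factors ignored).

    Under (1.6a) (D2 is None) the design block is scaled by N^{1/2}.
    Under (1.6b) condition (7) is checked and 2*H2 = 2 - m*D1 - D2.
    """
    if not 0.0 < D1 < 1.0 / m:
        raise ValueError("need 0 < D1 < 1/m")
    H1 = 1.0 - m * D1 / 2.0
    if D2 is None:
        return H1, 0.5
    if not (0.0 < D2 < 1.0 and m * D1 + D2 < 1.0):
        raise ValueError("condition (7) fails")
    return H1, 1.0 - (m * D1 + D2) / 2.0

print(scaling_exponents(1, 0.4))        # i.i.d. design, case (1.6a)
print(scaling_exponents(1, 0.4, 0.3))   # long range dependent design, case (1.6b)
```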

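Next, in view of (1.3), (1.5) and (3), $\forall\, \lambda \in \mathbb{R}^{p+1}$,

(10)  $E\{\lambda'B_N^{-1}\sum_i X_i[\psi_1(\eta_i) - \frac{J_m}{m!}H_m(\eta_i)]\}^2 = E\{\sum_i \lambda'B_N^{-1}X_i\sum_{q=m+1}^{\infty}\frac{J_q}{q!}H_q(\eta_i)\}^2$

$= \sum_{q=m+1}^{\infty}\frac{J_q^2}{q!}\sum_i\sum_j E[\lambda'B_N^{-1}X_iX_j'B_N^{-1}\lambda]\,\rho^q(i-j)$

$\le \{\sum_{q=m}^{\infty}\frac{J_q^2}{q!}\}\sum_i\sum_j \left|\lambda'B_N^{-1}\begin{pmatrix} 1 & 0 \\ 0 & \Gamma(i-j) \end{pmatrix}B_N^{-1}\lambda\right|\,|\rho^{m+1}(i-j)| \to 0$,

by arguing as in Taqqu (1975, p. 294), under both (1.6a) and (1.6b), using $B_N$ as in (5) or in (8), as the case may be.

Combining (5), (6), (8), and (10), one sees that under either (1.6a) or (1.6b) (with $B_N$ as in (5) or as in (8), respectively) one has, by (3),

$\mathrm{Var}(\lambda'B_N^{-1}\sum_i X_i\psi_1(\eta_i)) = \mathrm{Var}[\lambda'B_N^{-1}\sum_i X_i\{\psi_1(\eta_i) - \frac{J_m}{m!}H_m(\eta_i)\}] + \mathrm{Var}[\lambda'B_N^{-1}\sum_i X_i\frac{J_m}{m!}H_m(\eta_i)]$

$= o(1) + O(1) = O(1)$.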

This then determines $B_N$ and verifies the assumption (b)(ii) of the Theorem 1 above when $\{\epsilon_i\}$, $\{\eta_i\}$, $\{\xi_i\}$ are as in (1.1), (1.4), (1.5), (1.6a) or (1.6b). Now, if $B_N$ is given by (5), then

(11)  $A_N = \begin{pmatrix} N^{1-H_1}L(N) & 0 \\ 0 & N^{1/2}I_{p\times p} \end{pmatrix} = \begin{pmatrix} A_{N1} & 0 \\ 0 & A_{N2} \end{pmatrix}$, say,

will satisfy (b)(i). If $B_N$ is given by (8), then

(12)  $A_N = \begin{pmatrix} N^{1-H_1} & 0 \\ 0 & N^{1-H_2}I_{p\times p} \end{pmatrix} L(N) = \begin{pmatrix} A_{N1} & 0 \\ 0 & A_{N2} \end{pmatrix}$, say,

will satisfy (b)(i), with $H_1$ and $H_2$ as in (5) and (8) satisfying (9). The above discussion is now summarized as

Theorem 2. Let $\{Y_i\}$, $\{\eta_i\}$, $\{\xi_i\}$, $\beta$, $\psi$ satisfy (1.1), (1.3), (1.4), (1.5), and (1.6a) or (1.6b). In addition assume that $\psi$ is nondecreasing satisfying (a) of Theorem 1 with $0 < E\psi'(\epsilon)$. Then, with $\hat\beta_N$ defined as a solution of (1.2),

$A_N(\hat\beta_N - \beta) = (R\,E\psi'(\epsilon))^{-1}B_N^{-1}\sum_i X_iH_m(\eta_i)\,\frac{J_m}{m!} + o_p(1)$,

where $B_N$, $A_N$ are as in (5), (11) ((8), (12)) in the case of (1.6a) ((1.6b)).

Remark 4. Hermite rank $m$ of $\psi_1$. Often the function $\psi$ is chosen to be skew symmetric, viz, $\psi(-x) \equiv -\psi(x)$. Thus if $G$ is also skew symmetric then $\psi_1(-x) = \psi(G(-x)) = \psi(-G(x)) = -\psi(G(x)) = -\psi_1(x)$. In such cases, using the fact that $H_q(-x) = (-1)^qH_q(x)$ for all $q$, we have

$J_q = E\psi_1(\eta)H_q(\eta) = (1 + (-1)^{q+1}) \cdot E\{\psi_1(\eta)H_q(\eta)I(\eta > 0)\} \ne 0$,  $q = 1$.

Therefore, $m = 1$, $J_1 = 2E\{\psi_1(\eta)\,\eta\,I(\eta > 0)\}$, $H_1(\eta) = \eta$ and, from Theorem 2,

(13)  $A_N(\hat\beta_N - \beta) = [R\,E\psi'(\epsilon_1)]^{-1} \cdot B_N^{-1}\sum_i X_i\eta_i \cdot J_1 + o_p(1)$.
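
The parity argument of Remark 4 can be illustrated numerically. The sketch below is a minimal example under stated assumptions: $G$ is taken to be the identity (Gaussian errors), $\psi$ is the Huber score with $c = 1.345$, and the coefficients $J_q$ are approximated by a midpoint rule; the names `psi1`, `HERMITE` and `J` are hypothetical.

```python
import math
from statistics import NormalDist

C = 1.345  # Huber tuning constant (an illustrative choice)

def psi1(x: float) -> float:
    """psi_1 = psi(G) with G the identity (Gaussian errors) and psi the Huber score."""
    return max(-C, min(C, x))

# First three Hermite polynomials with leading coefficient 1.
HERMITE = {1: lambda x: x, 2: lambda x: x * x - 1.0, 3: lambda x: x**3 - 3.0 * x}

def J(q: int, lim: float = 10.0, n: int = 100000) -> float:
    """J_q = E psi_1(eta) H_q(eta), approximated by the midpoint rule on [-lim, lim]."""
    step = 2.0 * lim / n
    total = 0.0
    for i in range(n):
        x = -lim + (i + 0.5) * step
        total += psi1(x) * HERMITE[q](x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)

print([J(q) for q in (1, 2, 3)])
```

In this example the even-order coefficient vanishes up to rounding while $J_1$ is positive, so the Hermite rank is 1. For this particular choice, Gaussian integration by parts gives $J_1 = E\psi'(\eta) = 2\Phi(c) - 1$, which the numerical value reproduces.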

Now let $\tilde\beta_N$ be the least squares estimator of $\beta$ in (1.1). Then carrying out an analysis like the above one can derive the following:

If $EG(\eta) = 0$, $0 < EG^2(\eta) < \infty$ and $G$ is skew symmetric, then

$A_N(\tilde\beta_N - \beta) = R^{-1} \cdot B_N^{-1}\sum_i X_i\eta_i \cdot \alpha_1 + o_p(1)$,

where $\alpha_1 = EG(\eta)\eta$ and where $A_N$ and $B_N$ are the same as in (13).

The r.v. $\sum_i X_i\eta_i$ is precisely the leading term in the least squares estimator of the regression parameter with the errors $\{\eta_i\}$ and the design vectors $\{X_i\}$. Thus it follows that the above class of M-estimators corresponding to the skew symmetric scores and symmetric errors are asymptotically like the least squares estimators regardless of whether the errors are Gaussian or not.

Now, suppose that there is no intercept parameter in (1.1). Then the result like (13), with $X_i$ replaced by $\xi_i$ and $A_N$, $B_N$ replaced by $A_{N2}$, $B_{N2}$ of (5), (11) and/or (8), (12), remains valid. Of course now $\hat\beta_N$ is $p \times 1$ as is $\beta$. Note that in the case of (1.6a),

$\lambda'B_{N2}^{-1}\sum_i \xi_i\eta_i \Rightarrow N(0, \lambda'\Gamma\lambda)$,  $\forall\, \lambda \in \mathbb{R}^p$,

where $\Gamma = \Gamma(0) = E\xi\xi'$.

But in the case of (1.6b) the limiting distribution is different. To determine this limiting distribution, we use Theorem 6.1 of Fox and Taqqu (1987). Observe that if $\{\xi_i\}$ are long range dependent and Gaussian then so are the r.v.'s $\{\lambda'\xi_i\}$ for every $\lambda \in \mathbb{R}^p$ with the same exponent $D_2$ as in (1.6b). Now, take $X_j$ and $Y_j$ in Theorem 6.1 of Fox and Taqqu to be $\lambda'\xi_j$ and $\eta_j$, respectively. One then sees that (1.4), (1.5) and (1.6b) together with the Gaussianness assumptions imply all the conditions of that Theorem for every $\lambda$. Hence,

$\lambda'B_{N2}^{-1}\sum_i \xi_i\eta_i = N^{-H_2}L(N)\sum_i \lambda'\xi_i\eta_i \Rightarrow Z(1)\cdot(\lambda'\Gamma\lambda)^{1/2}$,

with $Z(1)$ obtained from (6.1) of Fox and Taqqu after $t$ is set equal to 1 in there. □

3. The Case of Non-Random Designs. In order to separate this case from that of the random designs, we shall now denote an $N \times p$ design matrix of known constants by $C$ and its $i$th row by $c_{Ni}'$, $1 \le i \le N$. Consider the linear regression model where one observes $\{Y_{Ni}\}$ satisfying

(1)  $Y_{Ni} = c_{Ni}'\beta + \epsilon_i$,

with $\{\epsilon_i\}$ as in (1.1).

Throughout we shall assume that

(L1)  $(C'C)^{-1}$ exists for all $N \ge p$.

The class of M-estimators is defined as a solution $t$ of

(2)  $T(t) := \sum_i c_{Ni}\,\psi(Y_{Ni} - c_{Ni}'t) = 0$,

where $\psi$ is assumed to satisfy (1.3). Again, our objective here is to investigate the large sample behavior of these estimators when $\{\eta_i\}$ satisfy (1.4). Of course conceptually the discussion that follows is similar to that in Section 2 above except for the difficulties created by the nonstationarity that is introduced in the problem by $\{c_{Ni}\}$. We begin by giving

Theorem 1. Let $\epsilon_1, \epsilon_2, \ldots$ be a strictly stationary sequence of r.v.'s and $C$ be as above satisfying (L1) and assume (1) above holds. In addition, assume that the following hold.

(L2) The score function $\psi$ is absolutely continuous with its almost everywhere derivative $\psi'$ satisfying $E|\psi'(\epsilon)| < \infty$ and such that the function $z \to E|\psi'(\epsilon - z) - \psi'(\epsilon)|$ is continuous at zero.

(L3) There exist sequences $\{A_N\}$ and $\{B_N\}$ of $p \times p$ matrices such that they are positive definite for sufficiently large $N$ and satisfy

(i) $\|A_N^{-1}\| \to 0$, $\|B_N^{-1}\| \to 0$;  (ii) $B_N^{-1}C'CA_N^{-1} = I_{p\times p}$;

(iii) $\max_i \|A_N^{-1}c_{Ni}\| \to 0$;  (iv) $\|B_N^{-1}T(\beta)\| = O_p(1)$.

Then, for every $0 < b < \infty$,

(3)  $E \sup_{\|\Delta\| \le b} \|B_N^{-1}[T(\beta + A_N^{-1}\Delta) - T(\beta)] + B_N^{-1}\sum_i c_{Ni}c_{Ni}'\psi'(\epsilon_i)A_N^{-1}\Delta\| = o(1)$.

If in addition, $\epsilon_i \equiv G(\eta_i)$, with $\{\eta_i\}$ satisfying (1.4),

(L4) $\psi$ is nondecreasing, $0 < E\psi'(\epsilon)$, $E(\psi'(\epsilon))^2 < \infty$, and

(L5) $N^{1-(D/2)}\max_{1\le i\le N}\|B_N^{-1}c_{Ni}\| \cdot \|A_N^{-1}c_{Ni}\| \to 0$, with $D = D_1$ of (1.4),

then

(4)  $B_N^{-1}\sum_i c_{Ni}c_{Ni}'\psi'(\epsilon_i)A_N^{-1} = I_{p\times p}\,E\psi'(\epsilon) + o_p(1)$,

and

(5)  $A_N(\hat\beta_N - \beta) = [E\psi'(\epsilon)]^{-1} \cdot B_N^{-1}T(\beta) + o_p(1)$.

Remark 1. Some comments about the assumptions are in order. The assumptions (L2) and (L3) are similar to the assumptions (a) and (b) of Theorem 2.1 above. Recall that in the linear regression model with independent or weakly dependent errors and with the design matrix $C$, the magnitude of $T(\beta)$ is of the order $(C'C)^{1/2}$. However in the current situation, where $\{\epsilon_i\}$ are functions of long range dependent r.v.'s, we cannot expect this magnitude. But we must still have (L3)(ii) in order to stabilize the LHS(4).

In the case of random and stationary design variables, as in Section 2 above, an analogue of (4) is given by the Ergodic Theorem which does not require the second moment of the summands. But in the present situation, the summands in the LHS of (4) are neither stationary nor independent. The assumptions (L4), (L5) and (L3)(ii) together with the Gaussianness of $\{\eta_i\}$ are used to conclude (4) in the proof below.

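Proof. To simplify writing, let $a_i := A_N^{-1}c_{Ni}$, $b_i := B_N^{-1}c_{Ni}$, $1 \le i \le N$. Now, by the absolute continuity of $\psi$, the Fubini Theorem and the Cauchy-Schwarz inequality, the

LHS(3) $\le 2b\sum_i \|b_i\|\,\|a_i\|\,\{(2\|a_i\|)^{-1}\int_{-\|a_i\|}^{\|a_i\|} E|\psi'(\epsilon - zb) - \psi'(\epsilon)|\,dz\}$

$\le 2b\,(\sum_i\|b_i\|^2\sum_i\|a_i\|^2)^{1/2}\max_i[(2\|a_i\|)^{-1}\int_{-\|a_i\|}^{\|a_i\|} E|\psi'(\epsilon - zb) - \psi'(\epsilon)|\,dz] \to 0$,

by (L2), (L3)((i) - (iii)). Note that by (L3)(ii),

$\sum_i\|b_i\|^2\sum_i\|a_i\|^2 = \mathrm{tr.}\,B_N^{-1}C'CB_N^{-1}\cdot A_N^{-1}C'CA_N^{-1} = p = O(1)$,

where tr.A := trace A for any matrix A.

Next, let $\psi_2(\eta) := \psi'(G(\eta))$ and $\alpha_q := E\psi_2(\eta)H_q(\eta)$. In view of (L4), the Hermite expansion of $\psi_2(\eta_i) - E\psi_2(\eta)$ is $\sum_{q=1}^{\infty}\frac{\alpha_q}{q!}H_q(\eta_i)$. Also note that the LHS(4) above is now $\sum_i b_ia_i'\psi_2(\eta_i)$. Hence $\forall\,\lambda \in \mathbb{R}^p$, by (2.3),

$E\|\sum_i a_ib_i'[\psi_2(\eta_i) - E\psi_2(\eta)]\lambda\|^2 = \sum_{q=1}^{\infty}\frac{\alpha_q^2}{q!}\sum_i\sum_j \lambda'b_ia_i'a_jb_j'\lambda\,\rho^q(i-j)$

$\le \mathrm{Var}(\psi_2(\eta)) \cdot \|\lambda\|^2\max_i\|b_ia_i'\|^2\sum_i\sum_j|\rho(i-j)|$

(6)  $= \max_i\{\|B_N^{-1}c_{Ni}\| \cdot \|A_N^{-1}c_{Ni}\|\}^2 \cdot O(N^{2-D})$,

because $\sum_i\sum_j|\rho(i-j)| = O(N^{2-D})$ and because

$\|b_ia_i'\|^2 = \mathrm{tr}(b_ia_i'a_ib_i') = (b_i'b_i)(a_i'a_i) = \|b_i\|^2\,\|a_i\|^2 = \|B_N^{-1}c_{Ni}\|^2\,\|A_N^{-1}c_{Ni}\|^2$.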

Therefore  (4)  follows  from  the  assumption  (L4)  and  (6).  The  result  (5)  follows  as  in  Huber 
(op  cit.).  □ 


Our next objective is to determine $B_N$, using (L3)(iv). Again, to simplify exposition we shall write $\{c_i\}$ for $\{c_{Ni}\}$. Proceeding as in Section 2, we observe that $\forall\,\lambda \in \mathbb{R}^p$,

(7)  $E\{\lambda'\sum_i c_i\psi_1(\eta_i)\}^2 = E\{\lambda'\sum_i c_iH_m(\eta_i)\cdot\frac{J_m}{m!}\}^2 + E\{\lambda'\sum_i c_i[\psi_1(\eta_i) - H_m(\eta_i)\cdot\frac{J_m}{m!}]\}^2$

$= E\{\lambda'\sum_i c_iH_m(\eta_i)\cdot\frac{J_m}{m!}\}^2 + E\{\lambda'\sum_i c_i\sum_{q\ge m+1}\frac{J_q}{q!}H_q(\eta_i)\}^2$

$= \lambda'\sum_i\sum_j c_ic_j'\rho^m(i-j)\lambda\cdot\frac{J_m^2}{m!} + \sum_{q\ge m+1}\frac{J_q^2}{q!}\sum_i\sum_j \lambda'c_ic_j'\lambda\,\rho^q(i-j)$

$= \frac{J_m^2}{m!}\lambda'K_{N1}\lambda + \lambda'K_{N2}\lambda$,

where

$K_{N1} := \sum_i\sum_j c_ic_j'\rho^m(i-j)$,  $K_{N2} := \sum_{q\ge m+1}\frac{J_q^2}{q!}\sum_i\sum_j c_ic_j'\rho^q(i-j)$.

At this point one is clearly persuaded to choose $B_N = K_{N1}^{1/2}$ and then try to show that $\|B_N^{-1}K_{N2}B_N^{-1}\| \to 0$ so that we would have (L3)(iv) satisfied. Such a process, though feasible, appears to be quite involved for general $\{c_i\}$. However, if we make some further assumptions on the design variables then this process is less involved and more transparent.

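Accordingly, let $\varphi := (\varphi_1, \ldots, \varphi_p)^t$ be a vector of measurable functions on $[0,1]$ to $\mathbb{R}$, the superscript $t$ denoting transpose, satisfying the following conditions:

(a1) With $D = D_1$ and $L$ as in (1.4), $m$ as the Hermite rank of $\psi_1(\eta)$,

(i)  $\int_0^1\int_0^{1-u} |\varphi_\ell(u)\varphi_k(u+v)\,v^{-mD}L(v)|\,dv\,du < \infty$,  $D < 1/m$,

(ii)  $\int_0^1 |\varphi_\ell(u)\varphi_k(u)|\,du < \infty$,  $\ell, k = 1, 2, \ldots, p$.

(a2) (i) $N^{-D/4}\max_{1\le i\le N}\|\varphi(i/N)\| \to 0$;  (ii) $N^{-1+mD}\max_{1\le i\le N}\|\varphi(i/N)\|^2 \to 0$.

(a3) The matrix $\mathcal{G}^{-1}$ exists, where

$\mathcal{G} = ((g_{\ell k}))$,  $g_{\ell k} = \int_0^1\int_0^1 \varphi_\ell(u)\varphi_k(v)\,|v-u|^{-mD}L(|v-u|)\,du\,dv$,  $\ell, k = 1, \ldots, p$.

Given such a collection of $\varphi$'s, choose

(8)  $c_i := \varphi(i/N)$,  $1 \le i \le N$.

Now observe that

$N^{-1}C'C = N^{-1}\sum_i c_ic_i' \approx \int_{1/N}^{1-1/N}\varphi(u)\varphi^t(u)\,du \to \int_0^1\varphi(u)\varphi^t(u)\,du$,

so that

(9)  $N^{-2+mD}\,C'C \to 0$, because $-1 + mD < 0$.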


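From (1.4), (8) and the slowly varying property of $L$ it follows that

$\lambda'K_{N1}\lambda = \sum_{\ell=1}^{p}\sum_{k=1}^{p}\lambda_\ell\lambda_k\sum_i\sum_j \varphi_\ell(i/N)\varphi_k(j/N)\,\rho^m(i-j)$

$= \lambda'C'C\lambda + 2\sum_{\ell=1}^{p}\sum_{k=1}^{p}\lambda_\ell\lambda_k\sum\sum_{i<j}\varphi_\ell(i/N)\varphi_k(j/N)\,\rho^m(j-i)$

$\approx N^{2-mD}\cdot 2\sum_{\ell=1}^{p}\sum_{k=1}^{p}\lambda_\ell\lambda_k\int_0^1\int_0^{1-u}\varphi_\ell(u)\varphi_k(u+v)\,v^{-mD}L(v)\,dv\,du$

$= N^{2-mD}\,\lambda'\mathcal{G}\lambda$.

Now let

(10)  $B_N := N^{H}\mathcal{G}^{1/2}$,  $H = 1 - (mD/2)$,  $D = D_1$ of (1.4).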

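Our next objective is to show that the second term in the RHS(7) is $o(N^{2-mD})$. To that effect, note that $q \ge m+1$, $|\rho(k)| \le 1$, $\forall\, k \ge 1$, imply that

(11)  $|\lambda'\sum\sum_{i<j} c_ic_j'\rho^q(j-i)\lambda| \le \sum_{\ell=1}^{p}\sum_{k=1}^{p}|\lambda_\ell\lambda_k|\sum\sum_{i<j}|\varphi_\ell(i/N)\varphi_k(j/N)\,\rho^{m+1}(j-i)|$.

Now, since $|\rho(k)| \to 0$ as $k \to \infty$, $\forall\,\epsilon > 0$ $\exists\, N_\epsilon$ such that $|\rho(k)| < \epsilon$ $\forall\, k \ge N_\epsilon$. Hence, $\forall\, N > N_\epsilon$,

$\sum\sum_{i<j}|\varphi_\ell(i/N)\varphi_k(j/N)\,\rho^{m+1}(j-i)| \le \sum\sum_{i<j,\ (j-i)\le N_\epsilon}|\varphi_\ell(i/N)\varphi_k(j/N)| + \epsilon\sum\sum_{i<j,\ (j-i)>N_\epsilon}|\varphi_\ell(i/N)\varphi_k(j/N)\,\rho^m(j-i)|$

$= T_{N1} + \epsilon\cdot T_{N2}$, say.

But, $\forall\,\ell, k = 1, \ldots, p$,

$T_{N1} \le N_\epsilon\cdot N\cdot\max_i\|\varphi(i/N)\|^2 = o(N^{2-mD})$, by (a2)(ii),

$T_{N2} \le \sum\sum_{i<j}|\varphi_\ell(i/N)\varphi_k(j/N)|\,|\rho^m(j-i)| \approx N^{2-mD}\int_0^1\int_0^{1-u}|\varphi_\ell(u)\varphi_k(v+u)\,v^{-mD}L(v)|\,dv\,du = O(N^{2-mD})$, by (a1)(i).

Hence, $\forall\,\epsilon > 0$, $\exists\, N_\epsilon$ such that

(12)  LHS(11) $\le o(N^{2-mD}) + \epsilon\cdot O(N^{2-mD})$,  $\forall\, N > N_\epsilon$.

From (9), (12) and the definition of $K_{N2}$ it follows that $\forall\, N > N_\epsilon$,

$N^{-2+mD}|\lambda'K_{N2}\lambda| \le \mathrm{Var}\,\psi_1(\eta)\cdot N^{-2+mD}\{|\lambda'C'C\lambda| + 2\cdot\mathrm{LHS}(11)\}$

$\le o(1) + \epsilon\cdot O(1) \to 0$, by now letting $\epsilon \to 0$.

It thus follows that (L3)(iv) holds with $B_N$ given by (10). From (L3)(ii) we get

(13)  $A_N = B_N^{-1}\cdot C'C \approx N^{1-H}\mathcal{G}^{-1/2}\int_0^1\varphi\varphi^t$.

Note that $m \ge 1$ $\Rightarrow$

$\max_i\|A_N^{-1}c_i\| \approx \max_i N^{-mD/2}\|\varphi(i/N)\| \le N^{-D/4}\max_i\|\varphi(i/N)\|$

and

$N^{1-D/2}\max_i[\|A_N^{-1}c_{Ni}\|\cdot\|B_N^{-1}c_{Ni}\|] \approx [N^{-D/4}\max_i\|\varphi(i/N)\|]^2$

so that (a2) implies (L3)(iii) and (L5). This shows that all the assumptions of Theorem 1 are satisfied. We now summarize the above discussion as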

Theorem 2. Suppose that the linear regression model (1), with errors as in (1.1) and (1.4), holds. About the design variables $\{c_{Ni}\}$ and the score function $\psi$ assume that (8), (a1)-(a3), (1.3), (L2) and (L4) hold. Then M-estimators $\{\hat\beta_N\}$ defined as solutions of (2) satisfy

(14)  $N^{1-H}(\hat\beta_N - \beta) = \{m!\cdot\int_0^1\varphi\varphi^t\cdot E\psi'(\epsilon)\}^{-1}\cdot N^{-H}\sum_i\varphi(i/N)H_m(\eta_i)\cdot J_m + o_p(1)$,

where $H = (1 - mD/2)$, $D = D_1$ of (1.4).

Remark 2. Observe that if the design generating functions are bounded then $0 < D < 1/m$ guarantees the satisfaction of (a1) and (a2). In particular if $\varphi_\ell(u) = u^\ell$, $\ell = 1, \ldots, p$, then (a1) - (a3) are all satisfied. That is, all of these conditions are satisfied in the case of the $p$th order polynomials.

An example of an unbounded design is obtained by taking $p = 1$, $\varphi_1(u) = u^{-r}$, $r > 0$. Then (a1)-(a3) are satisfied as long as $r < (1-mD)/2$.
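
The threshold $r < (1-mD)/2$ can be traced to condition (a2)(ii): for $\varphi_1(u) = u^{-r}$ the maximum over the grid is attained at $i = 1$, so the left-hand side of (a2)(ii) equals $N^{-1+mD+2r}$. The Python sketch below only evaluates this one condition, for illustrative parameter values; the function name `a2_ii` is hypothetical.

```python
def a2_ii(N: int, r: float, m: int, D: float) -> float:
    """N^{-1+mD} * max_i phi(i/N)^2 for p = 1 and phi(u) = u^{-r}, r > 0."""
    max_phi = (1.0 / N) ** (-r)      # phi is decreasing, so the maximum is at i = 1
    return N ** (-1.0 + m * D) * max_phi ** 2

m, D = 1, 0.4                        # illustrative values, so (1 - m*D)/2 = 0.3
for r in (0.2, 0.3, 0.5):
    print(r, [a2_ii(N, r, m, D) for N in (10**2, 10**4, 10**6)])
```

Below the threshold the quantity decays with $N$, at the threshold it stays constant, and above it the quantity diverges.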

Remark 3. An analogue of Remark 2.4 applies here also with obvious modifications. Consequently, for skew symmetric $\psi$ and symmetric errors the asymptotic distribution of $N^{1-H}(\hat\beta_N - \beta)$ is $p$-variate Normal with mean vector 0 and the covariance matrix

$[\{\int_0^1\varphi\varphi^t\}^{-1}\,\mathcal{G}\,\{\int_0^1\varphi\varphi^t\}^{-1}]\,\{E\psi'(\epsilon)\}^{-2}\,J_1^2$.